Two block cyclic reduction linear system solvers are considered and implemented using the OpenCL framework. The topics of interest include a simplified scalar cyclic reduction tridiagonal system solver and the impact of increasing the radix number of the algorithm. Both implementations are tested on the Poisson problem in two and three dimensions, using an Nvidia GTX 580 series GPU and double-precision floating-point arithmetic. The numerical results indicate up to a 6-fold speed increase for the two-dimensional problems and up to a 3-fold speed increase for the three-dimensional problems when compared to equivalent CPU implementations run on an Intel Core i7 quad-core CPU.
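For readers unfamiliar with the scalar building block named above, the following is a minimal serial sketch of radix-2 cyclic reduction for a tridiagonal system. It is not the OpenCL implementation described here. The function name `cyclic_reduction` and the array names `a`, `b`, `c`, `d` are illustrative assumptions, and the sketch assumes a system that needs no pivoting, such as the diagonally dominant matrices arising from the Poisson problem.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for x.

    Hypothetical helper for illustration only. a[0] and c[-1] are ignored.
    No pivoting is done, so every diagonal entry met during the
    reduction is assumed to be nonzero.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction step: keep the unknowns with odd 0-based index and
    # eliminate their even-indexed neighbours from each kept equation.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = -a[i] / b[i - 1]
        lower = a[i - 1] if i - 1 > 0 else 0.0
        new_a = alpha * lower
        new_b = b[i] + alpha * c[i - 1]
        new_c = 0.0
        new_d = d[i] + alpha * d[i - 1]
        if i + 1 < n:
            gamma = -c[i] / b[i + 1]
            upper = c[i + 1] if i + 1 < n - 1 else 0.0
            new_b += gamma * a[i + 1]
            new_c = gamma * upper
            new_d += gamma * d[i + 1]
        ra.append(new_a)
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)

    # The reduced system is again tridiagonal, with about half the unknowns.
    x_odd = cyclic_reduction(ra, rb, rc, rd)

    # Back substitution: recover the eliminated even-indexed unknowns.
    x = [0.0] * n
    x[1::2] = x_odd
    for i in range(0, n, 2):
        rhs = d[i]
        if i > 0:
            rhs -= a[i] * x[i - 1]
        if i + 1 < n:
            rhs -= c[i] * x[i + 1]
        x[i] = rhs / b[i]
    return x


if __name__ == "__main__":
    # One-dimensional Poisson-type matrix tridiag(-1, 2, -1) with 7 unknowns.
    n = 7
    a = [0.0] + [-1.0] * (n - 1)
    b = [2.0] * n
    c = [-1.0] * (n - 1) + [0.0]
    d = [0.0] * (n - 1) + [8.0]
    print(cyclic_reduction(a, b, c, d))
```

Each reduction level roughly halves the number of unknowns, and the updates within a level are independent of one another, which is what makes the method attractive on a GPU. A higher-radix variant would eliminate more unknowns per level, at the cost of more work per reduced equation.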